Phantom-BTB: Improving Branch Target Buffer Performance by Leveraging the On-Chip Memory Hierarchy
Abstract
Modern processors use Branch Target Buffers (BTBs) to predict the target addresses of branches so that they can fetch ahead in the instruction stream, increasing concurrency and performance. Ideally, a BTB would be large enough to capture the entire working set of the application yet small enough for fast access and practical on-chip dedicated storage. Depending on the application, these requirements are at odds. For example, commercial applications that exhibit large instruction footprints benefit from large BTBs. This work introduces a new BTB design that accommodates large instruction footprints without dedicating expensive on-chip resources. In the proposed Phantom-BTB (PBTB) design, a conventional BTB is augmented with a virtual table that collects branch target information as the application runs. The virtual table does not have fixed dedicated storage. Instead, it resides in reserved physical address space and is transparently allocated in the on-chip caches other than the L1, at cache-line granularity. The entries present in the virtual table are proactively prefetched and installed in the dedicated conventional BTB, thus expanding its perceived capacity. Experimental results with commercial workloads under full-system simulation demonstrate that PBTB improves performance over a 1K-entry BTB by 5.6% on average and up to 10.3%, while using only 89 bytes of extra storage. By adding a small prefetch buffer, the performance improvement with PBTB rises to 6.8% on average and to a high of 12.3%, while the storage overhead is only 8% of a conventional 1K-entry BTB. This performance matches the improvement of a conventional 4K-entry, one-cycle access BTB, while the dedicated storage is 3.6 times smaller.
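The two-level organization described in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: the abstract does not state how the virtual table is indexed, what triggers a prefetch, or how many entries fit in a cache line, so the names `PhantomBTB`, `REGION_BYTES`, and `ENTRIES_PER_LINE`, the region-based grouping, the miss-triggered prefetch, and the LRU replacement are all assumptions chosen for illustration. Cache placement and prefetch latency are ignored.

```python
from collections import OrderedDict

REGION_BYTES = 64      # assumed: branches in the same 64-byte code region share a line
ENTRIES_PER_LINE = 4   # assumed: number of BTB entries packed into one cache line


class PhantomBTB:
    """Toy model: a small dedicated BTB backed by a larger virtual table."""

    def __init__(self, btb_entries):
        self.btb_entries = btb_entries
        self.btb = OrderedDict()  # dedicated BTB: branch PC -> target, in LRU order
        self.virtual = {}         # virtual table: line index -> {branch PC: target}

    def _line_of(self, pc):
        # Assumed indexing: one virtual-table line per code region.
        return pc // REGION_BYTES

    def _install(self, pc, target):
        # Insert into the dedicated BTB, evicting the least recently used entry.
        self.btb[pc] = target
        self.btb.move_to_end(pc)
        while len(self.btb) > self.btb_entries:
            self.btb.popitem(last=False)

    def predict(self, pc):
        """Return the predicted target for pc, or None if it is unknown."""
        if pc in self.btb:
            self.btb.move_to_end(pc)
            return self.btb[pc]
        # Dedicated-BTB miss: bring in the whole virtual-table line for this
        # region, so neighbouring branches are installed along with pc.
        line = self.virtual.get(self._line_of(pc), {})
        for vpc, vtarget in line.items():
            self._install(vpc, vtarget)
        return line.get(pc)

    def update(self, pc, target):
        """Record a resolved taken branch in both levels."""
        self._install(pc, target)
        line = self.virtual.setdefault(self._line_of(pc), {})
        line.pop(pc, None)                   # refresh position if already present
        if len(line) >= ENTRIES_PER_LINE:
            line.pop(next(iter(line)))       # line full: drop its oldest entry
        line[pc] = target
```

With a two-entry dedicated BTB, a branch that has been evicted can still be predicted from the virtual table, and its neighbour in the same line is installed at the same time:

```python
p = PhantomBTB(btb_entries=2)
for pc, target in [(0x100, 0x500), (0x104, 0x600), (0x1000, 0x2000), (0x2000, 0x3000)]:
    p.update(pc, target)
hit = p.predict(0x100)   # served from the virtual table; 0x104 is installed too
```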
Similar resources
Rehashable BTB: An Adaptive Branch Target Buffer to Improve the Target Predictability of Java Code
Abstract. Java programs are increasing in popularity and prevalence on numerous platforms, including high-performance general-purpose processors. The dynamic characteristics of the Java runtime system present unique performance challenges for several aspects of microarchitecture design. In this work, we focus on the effects of indirect branches on branch target address prediction performance. R...
The Precomputed Branch Architecture
Accurate instruction fetch and branch prediction is increasingly important on today’s superscalar architectures. Fetch prediction is the process of determining the next instruction to request from the memory subsystem. Branch prediction is the process of predicting the likely outcome of branch instructions. A branch target buffer (BTB) is often used to provide target addresses for taken branch...
Improving Branch Predictability in Java Processing
Java programs are becoming increasingly prevalent on numerous platforms ranging from embedded systems to enterprise servers. Dynamic translation (interpretation and compilation), frequent calls to native interface libraries or OS kernel services and abundant usage of virtual methods by Java programs can complicate the intrinsic predictability of the control flow that can be exploited by an ILP ...
SABA: a Zero Timing Overhead Power-Aware BTB for High-Performance Processors
Modern high-performance processors access the branch target buffer (BTB) every cycle to speculate branch target addresses. This aggressive approach improves performance as it results in early identification of target addresses. However, unfortunately, such accesses, quite often, are unnecessary as there is no control flow instruction among those fetched. In this work we introduce Speculative BT...
The Performance of Counter- and Correlation-Based Schemes for Branch Target Buffers
Branch target buffers, or BTBs, can be used to improve CPU performance by maintaining target and history information of previously executed branches. We present trace-driven simulation results comparing counter-based and correlation-based prediction schemes for a variety of branch target buffer sizes. We report relative performance estimates to show both the relative merits of various techniques...
Publication date: 2003